Tight Bound on Randomness for Violating the
Clauser-Horne-Shimony-Holt Inequality 

Yifeng Teng, Shenghao Yang, Member, IEEE, Siwei Wang and Mingfei Zhao 


Abstract —Free will (or randomness) has been studied to 
achieve loophole-free Bell's inequality test and to provide device-independent
quantum key distribution security proofs. The
required randomness such that a local hidden variable model 
(LHVM) can violate the Clauser-Horne-Shimony-Holt (CHSH) 
inequality has been studied, but a tight bound has not been 
proved for a practical case that i) the device settings of the 
two parties in the Bell test are independent; and ii) the device 
settings of each party can be correlated or biased across different 
runs. Using some information theoretic techniques, we prove in 
this paper a tight bound on the required randomness for this 
case such that the CHSH inequality can be violated by certain 
LHVM. Our proof has a clear achievability and converse style. 
The achievability part is proved using type counting. To prove 
the converse part, we introduce a concept called profile for a 
set of binary sequences and study the properties of profiles. Our 
profile-based converse technique is also of independent interest. 

Index Terms —Bell's inequality test, CHSH inequality, randomness
loophole, randomness bound


I. Introduction 

Bell’s inequality test [Tj| provides an approach to verify the 
existence of physical phenomenon that cannot be explained by 
local hidden variable models (LHVMs). The Clauser-Horne- 
Shimony-Holt (CHSH) inequality Q is the most often used in¬ 
equality in Bell test experiments. Experimental demonstrations 
of the violation of CHSH inequalities have been conducted 
since 1982 0 (see also Giustina et al.’s work E3 and the 
references therein). These Bell tests, however, suffer from an
inherent loophole that the settings of the participating devices
may not be chosen totally randomly, called the randomness
(free will) loophole. A small amount of correlation between
the device settings makes it possible that a LHVM can
reproduce predictions of quantum mechanics 0-0. This 
loophole also weakens the Bell’s inequality based security 
proofs of device-independent quantum key distribution ED- 
02 and randomness expansion 02-03. 

One of the essential questions in the randomness loophole is 
the bound of randomness such that the correctness of Bell tests 
can (or cannot) be guaranteed 0 - 0 . m-m. Using a min-entropy
type randomness measure, the bound of randomness
required in a CHSH inequality test can be formulated as an 
optimization problem, and various special cases have been 
solved ED, ED, 00. One case that has not been completely 
resolved in the literature is that the two parties of the test 
have independent settings, but the setting of each party can 
be biased or correlated across different runs. In this paper, 
we study this case and obtain the asymptotic optimal value 
explicitly. 

A. Problem Formulation 

Let $n$ be a positive integer, and let $X$, $Y$ be two random
variables over $\{0,1\}^n$ with a joint distribution $p_{XY}$. We may
regard $X$ and $Y$ as the device settings of the two parties
in an $n$-run Bell test, respectively. The following randomness
measure has been used in the literature:

$$P = \Big( \max_{\mathbf{x},\mathbf{y}\in\{0,1\}^n} p_{XY}(\mathbf{x},\mathbf{y}) \Big)^{1/n}.$$

When X and Y are independent and uniformly distributed, 
P = 1/4, which is the minimum value of P and corresponds 
to the case of complete randomness. When X and Y are 
deterministic, P = 1, which corresponds to the case of zero 
randomness. Note that P is related to the min-entropy: 

$$H_\infty(X,Y) := -\log \max_{\mathbf{x},\mathbf{y}\in\{0,1\}^n} p_{XY}(\mathbf{x},\mathbf{y}) = -n\log P.$$

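Regard the vectors $\mathbf{x} \in \{0,1\}^n$ as column vectors and
denote by $\mathbf{x}^T$ the transpose of $\mathbf{x}$. The optimization problem
of interest is

$$\min_{p_{XY}} \; P \quad \text{s.t.} \quad \frac{1}{n}\sum_{\mathbf{x},\mathbf{y}} \mathbf{x}^T\mathbf{y}\; p_{XY}(\mathbf{x},\mathbf{y}) \le \frac{4-S_Q}{8}, \tag{1}$$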

where $S_Q = 2\sqrt{2}$ is a quantum constant. Readers may refer to
0, ED, ED to see how this problem is obtained. Optimization
(1) can be simplified to a linear program [18]. When
$n = 1$, the optimal value of (1) is $(S_Q + 4)/24 \approx 0.285$,
which was shown by Hall 0 and Koh et al. [16]. When
$n \to \infty$, Pope and Kay ED showed that the optimal value of
(1) converges to $3^{-(S_Q+4)/8}\, 2^{-h_b((4-S_Q)/8)} \approx 0.258$, where

$$h_b(t) = -t\log_2 t - (1-t)\log_2(1-t)$$

is the binary entropy function. 
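
The two numerical values quoted above for correlated device settings can be reproduced with a few lines of arithmetic. The sketch below is only a sanity check: it assumes the asymptotic expression reads $3^{-(S_Q+4)/8}\,2^{-h_b((4-S_Q)/8)}$, and the names `hb`, `single_run` and `asymptotic` are illustrative, not taken from the cited works.

```python
import math

def hb(t):
    """Binary entropy in bits, with the convention hb(0) = hb(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

S_Q = 2 * math.sqrt(2)  # quantum constant

# Optimal value of (1) for a single run (n = 1).
single_run = (S_Q + 4) / 24

# Assumed form of the limit of the optimal value of (1) as n grows.
asymptotic = 3 ** (-(S_Q + 4) / 8) * 2 ** (-hb((4 - S_Q) / 8))
```

Both quantities lie strictly between 1/4 (complete randomness) and 1 (no randomness), as the definition of $P$ requires.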

The case that X and Y are independent is of particular 
interest. Towards a loophole free Bell test, physicists have 
designed experiments with independent device settings li2Til .
In quantum key distribution, the experimental devices of the 
two parties may be manufactured independently and separated 
spatially, reducing the potential correlation of the device 
settings generated by the adversary. For independent device 
settings, the corresponding optimization problem becomes 

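$$\begin{aligned}
\min_{p_X,\,p_Y} \quad & P \\
\text{s.t.} \quad & \frac{1}{n}\sum_{\mathbf{x},\mathbf{y}} \mathbf{x}^T\mathbf{y}\; p_{XY}(\mathbf{x},\mathbf{y}) \le \frac{4-S_Q}{8}, \\
& p_{XY}(\mathbf{x},\mathbf{y}) = p_X(\mathbf{x})\,p_Y(\mathbf{y}).
\end{aligned} \tag{2}$$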

Note that the above problem is not derived by directly imposing
the constraint $p_{XY}(\mathbf{x},\mathbf{y}) = p_X(\mathbf{x})p_Y(\mathbf{y})$ on (1). For
completeness, we briefly discuss in Appendix I how (2) is derived from the
corresponding CHSH inequality test problem.
When $n = 1$, it was obtained by Koh et al. [16] that
the optimal value of (2) is $S_Q/8 \approx 0.354$. Let $P_Q$ be the
limit of the optimal value of (2) when $n \to \infty$. The value
of $P_Q$ has the following interpretation. For any independent
device settings with randomness less than $P_Q$, it is not possible
to have a LHVM that violates the CHSH inequality. But for
any value $P > P_Q$, there exists a LHVM that violates the
CHSH inequality where the device settings are independent
but have randomness less than or equal to $P$. Therefore, we
are motivated to study the value of $P_Q$ for the CHSH inequality
test. Yuan, Cao and Ma [19] have shown numerically that
$P_Q \le 0.264$.


B. Our Contribution 

In this paper, we provide an exact characterization of $P_Q$,
and hence close the unresolved case in Table I. In particular,
we show that

$$P_Q = 4^{-h_b(\sqrt{c_Q})} \approx 0.26428,$$

where $c_Q = \frac{4-S_Q}{8} \approx 0.1464$. Our formula has a min-entropy
interpretation: $-n\log_2 P_Q = 2n\,h_b(\sqrt{c_Q})$, i.e., each bit in $X$
and $Y$ has an average min-entropy $h_b(\sqrt{c_Q})$.
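
As a quick numerical check of the closed form above, the following sketch evaluates $c_Q$, $P_Q$ and the per-bit min-entropy. The helper name `hb` and the variable names are illustrative choices, not notation from the paper.

```python
import math

def hb(t):
    """Binary entropy in bits, with the convention hb(0) = hb(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

S_Q = 2 * math.sqrt(2)
c_Q = (4 - S_Q) / 8                      # threshold in the constraint
P_Q = 4 ** (-hb(math.sqrt(c_Q)))         # claimed tight bound

# Min-entropy interpretation: -log2(P_Q) is shared by one bit of X and one bit of Y.
min_entropy_per_bit = -math.log2(P_Q) / 2
```

The value of `P_Q` sits just above the lower limit 1/4, which matches the message that only slightly imperfect randomness already suffices for a LHVM to fake a violation.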

To prove achievability, we simplify (2) by introducing
an extra constraint that both $X$ and $Y$ have the uniform
distribution over $A_{n,l}$, the set of sequences in $\{0,1\}^n$ with
at most $nl$ 1s, and obtain a new optimization problem

$$\begin{aligned}
\min_{l} \quad & \left(\frac{1}{|A_{n,l}|}\right)^{2/n} \\
\text{s.t.} \quad & \frac{1}{n|A_{n,l}|^2}\sum_{\mathbf{x},\mathbf{y}\in A_{n,l}} \mathbf{x}^T\mathbf{y} \le \frac{4-S_Q}{8},
\end{aligned} \tag{2'}$$

which is essentially the same problem studied in [19, Section
IV-B]. The asymptotic optimal value of (2') when $n \to \infty$,
denoted by $P'_Q$, gives an upper bound on $P_Q$ since (2') is
obtained by reducing the feasible region of (2). The numerical
bound on $P_Q$ in [19] can be made analytical, and it shows that
$P'_Q \le 4^{-h_b(\sqrt{c_Q})}$ and hence $P_Q \le 4^{-h_b(\sqrt{c_Q})}$.

The major part of our paper is to show the converse that no
distributions of $X$ and $Y$ with randomness less than $4^{-h_b(\sqrt{c_Q})}$
can be feasible for (2), i.e., $P_Q \ge 4^{-h_b(\sqrt{c_Q})}$. Note that we
cannot use (2') as the starting point to prove the converse since
the derivation of (2') implies $P'_Q \ge P_Q$. It is possible to show
that $P'_Q \ge 4^{-h_b(\sqrt{c_Q})}$ but not $P_Q \ge 4^{-h_b(\sqrt{c_Q})}$ by studying
only (2').

TABLE I

Previous results.

                        n = 1                            n -> infinity
correlated devices      $(S_Q+4)/24 \approx 0.285$       $3^{-(S_Q+4)/8}\,2^{-h_b((4-S_Q)/8)} \approx 0.258$
independent devices     $S_Q/8 \approx 0.354$            $\le 0.264$

To prove the converse, we introduce a concept called profile
to characterize a set of binary sequences. We study some
properties of profiles, based on which optimization (2) is
simplified and the converse is proved. The profile technique
appears to be used here for the first time and may be of independent interest
for other problems.

In the remainder of this paper, the techniques used to prove
the main result are summarized in the next section, followed
by the details in Section III. Some concluding remarks are
given in Section IV.

II. Outline of the Proofs

As described in the previous section, we formulate an
optimization problem as follows.

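Problem 1. For any given $c \in (0,1/4]$ and every positive
integer $n$, consider the following program

$$\begin{aligned}
\min_{p_X,\,p_Y} \quad & \Big(\max_{\mathbf{x}} p_X(\mathbf{x}) \max_{\mathbf{y}} p_Y(\mathbf{y})\Big)^{1/n}, \\
\text{s.t.} \quad & \frac{1}{n}\sum_{\mathbf{x},\mathbf{y}\in\{0,1\}^n} p_X(\mathbf{x})\,p_Y(\mathbf{y})\,\mathbf{x}^T\mathbf{y} \le c,
\end{aligned}$$

where $p_X$ and $p_Y$ are probability distributions over $\{0,1\}^n$.
Let $P_n$ be the optimal value of the above program. We are
interested in the limit of the sequence $\{P_n\}$ when $n \to \infty$.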

Specifically, we will need the case $c = c_Q$ for the
physics problem of interest. Now we state the following
theorem.

Theorem 1. For Problem 1 with $c = c_Q$,
$$\lim_{n\to\infty} P_n = 4^{-h_b(\sqrt{c_Q})},$$
where

$$h_b(t) = -t\log_2 t - (1-t)\log_2(1-t) \tag{3}$$

is the binary entropy function.

In the rest of this section, we give an outline of the
main techniques used to prove this theorem. We have the
following bound for $P_n$.

Proposition 1. For all sufficiently large $n$, $1/4 \le P_n < 1/2$.
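
The upper bound in Proposition 1 comes from an explicit feasible pair of distributions (the one used in the proof in Section III-A): $p_X$ puts mass $1-2c$ on the all-zero sequence and spreads $2c$ evenly over the rest, while $p_Y$ is uniform. The brute-force sketch below evaluates the constraint and the objective of Problem 1 for that pair on small $n$; the function name `proposition1_check` is an illustrative choice.

```python
import itertools

def proposition1_check(n, c):
    """Return (constraint value, objective value) of Problem 1 for the
    explicit distributions used to prove the upper bound P_n < 1/2."""
    points = list(itertools.product((0, 1), repeat=n))
    zero = (0,) * n
    # p_X: mass 1 - 2c on the zero sequence, 2c spread uniformly elsewhere.
    p_x = {x: (1 - 2 * c) if x == zero else 2 * c / (2 ** n - 1) for x in points}
    # p_Y: uniform over {0,1}^n.
    p_y = {y: 1 / 2 ** n for y in points}
    constraint = sum(
        p_x[x] * p_y[y] * sum(xi * yi for xi, yi in zip(x, y))
        for x in points for y in points
    ) / n
    objective = (max(p_x.values()) * max(p_y.values())) ** (1 / n)
    return constraint, objective

c = 0.1464  # approximately c_Q
constraint3, objective3 = proposition1_check(3, c)
constraint1, objective1 = proposition1_check(1, c)
```

For this pair the constraint value works out to $c\,2^{n-1}/(2^n-1)$, which never exceeds $c$, and the objective equals $\frac{1}{2}(1-2c)^{1/n}$, which is below $1/2$ whenever $c > 0$.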


A. Simplified Problem 

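Let $S_X$ and $S_Y$ be the supports of distributions $p_X$ and $p_Y$,
respectively. Problem 1 can be simplified if we only consider
distributions that are uniform over their supports. Suppose that

$$p_X(\mathbf{x}) = \frac{1}{|S_X|}, \;\forall \mathbf{x}\in S_X, \qquad p_Y(\mathbf{y}) = \frac{1}{|S_Y|}, \;\forall \mathbf{y}\in S_Y.$$

Then we have

$$\Big(\max_{\mathbf{x}} p_X(\mathbf{x}) \max_{\mathbf{y}} p_Y(\mathbf{y})\Big)^{1/n} = \frac{1}{\sqrt[n]{|S_X|\cdot|S_Y|}},$$

and

$$\frac{1}{n}\sum_{\mathbf{x},\mathbf{y}\in\{0,1\}^n} p_X(\mathbf{x})\,p_Y(\mathbf{y})\,\mathbf{x}^T\mathbf{y} = \frac{1}{n|S_X|\cdot|S_Y|}\sum_{\mathbf{x}\in S_X,\,\mathbf{y}\in S_Y} \mathbf{x}^T\mathbf{y}.$$

Define a new problem as follows: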

Problem 2. For any given $c \in (0,1/4]$ and every positive
integer $n$, consider the following program

$$\begin{aligned}
\min_{S_X,\,S_Y} \quad & \frac{1}{\sqrt[n]{|S_X|\cdot|S_Y|}} \\
\text{s.t.} \quad & \frac{1}{n|S_X|\cdot|S_Y|}\sum_{\mathbf{x}\in S_X,\,\mathbf{y}\in S_Y} \mathbf{x}^T\mathbf{y} \le c,
\end{aligned} \tag{4}$$

where $S_X$ and $S_Y$ are subsets of $\{0,1\}^n$. Let $P'_n$ be the
optimal value of the above program. We are interested in the
limit of the sequence $\{P'_n\}$ when $n \to \infty$.

It is obvious that $P_n \le P'_n$ since only distributions that
are uniform over their supports are considered in Problem 2. The
following theorem enables us to focus on $\lim_n P'_n$.

Theorem 2. $\lim_{n\to\infty} P'_n/P_n = 1$.


B. Profiles

To study the properties of a set of binary vectors, we
introduce the concept of profile. For any positive integer $m$,
we call a vector $a = (a_1, a_2, \ldots, a_m) \in [0,1]^m$ a profile or an
$m$-profile. For each $S \subseteq \{0,1\}^n$, define the profile of set $S$ as

$$\Gamma(S) = \begin{cases} \dfrac{1}{|S|}\sum_{\mathbf{x}\in S} \mathbf{x}, & |S| > 0; \\ (0,0,\ldots,0), & |S| = 0. \end{cases}$$

We see that $\Gamma(S)$ is an $n$-profile.

Define the characteristic function of an $m$-profile $a$ as $f_a:
[0,1] \to [0,1]$ such that

$$f_a(t) = \begin{cases} a_1, & t = 0; \\ a_{\lceil tm \rceil}, & 0 < t \le 1. \end{cases}$$

The characteristic function of a profile is a step function. For
two profiles $a$ and $b$, we say $a \le b$ if $f_a(r) \le f_b(r)$ for any $0 \le r \le 1$,
where $a$ and $b$ may not include the same
number of components. For a vector $a$, we denote by $a_i$ the
$i$-th component of $a$.


Lemma 1. For two $n$-profiles $a$ and $b$,
$$\frac{1}{n}a^T b = \int_0^1 f_a(t)f_b(t)\,dt.$$

Proof: We write according to the definition that

$$\frac{1}{n}a^T b = \frac{1}{n}\sum_{i=1}^n a_i b_i = \sum_{i=1}^n \int_{(i-1)/n}^{i/n} f_a(t)f_b(t)\,dt = \int_0^1 f_a(t)f_b(t)\,dt,$$

where the second equality holds due to the fact that the
characteristic function of a profile is a step function. ■

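The following lemma tells us how to represent the constraint
in Problem 2 in a simple way using profiles.

Lemma 2. In Problem 2, the left hand side of constraint (4)
can be expressed as

$$\frac{1}{n|S_X|\cdot|S_Y|}\sum_{\mathbf{x}\in S_X,\,\mathbf{y}\in S_Y} \mathbf{x}^T\mathbf{y} = \frac{1}{n}a^T b,$$

where $a = \Gamma(S_X)$ and $b = \Gamma(S_Y)$.

Proof: We can write

$$\begin{aligned}
\frac{1}{n|S_X||S_Y|}\sum_{\mathbf{x}\in S_X,\,\mathbf{y}\in S_Y} \mathbf{x}^T\mathbf{y}
&= \frac{1}{n|S_X||S_Y|}\Big(\sum_{\mathbf{x}\in S_X}\mathbf{x}\Big)^T\Big(\sum_{\mathbf{y}\in S_Y}\mathbf{y}\Big) \\
&= \frac{1}{n}\frac{1}{|S_X|}\frac{1}{|S_Y|}\,(|S_X|a)^T(|S_Y|b) \qquad (5) \\
&= \frac{1}{n}a^T b,
\end{aligned}$$

where (5) follows from the definition of the profile of a set of
binary vectors. ■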
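
Lemma 2 is easy to check numerically on small random sets. The sketch below compares the direct double sum in constraint (4) with the profile form $\frac{1}{n}a^Tb$; the helper names (`profile`, `constraint_lhs`, `profile_inner_product`, `random_subset`) are hypothetical and introduced only for this illustration.

```python
import itertools
import random

def profile(S, n):
    """Profile of a nonempty set S of binary n-tuples: the coordinate-wise mean."""
    return [sum(x[i] for x in S) / len(S) for i in range(n)]

def constraint_lhs(SX, SY, n):
    """Left-hand side of constraint (4), computed by direct double summation."""
    total = sum(sum(xi * yi for xi, yi in zip(x, y)) for x in SX for y in SY)
    return total / (n * len(SX) * len(SY))

def profile_inner_product(SX, SY, n):
    """The profile form (1/n) a^T b with a = profile(SX), b = profile(SY)."""
    a, b = profile(SX, n), profile(SY, n)
    return sum(ai * bi for ai, bi in zip(a, b)) / n

def random_subset(n, size, rng):
    """A random subset of {0,1}^n with the given size."""
    cube = list(itertools.product((0, 1), repeat=n))
    return rng.sample(cube, size)

rng = random.Random(0)
SX = random_subset(5, 7, rng)
SY = random_subset(5, 11, rng)
direct = constraint_lhs(SX, SY, 5)
via_profiles = profile_inner_product(SX, SY, 5)
```

The two values agree up to floating-point rounding, which is why the rest of the argument can work with the pair of profiles instead of the sets themselves.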

The following theorem states that to get the value of $P'_n$,
we only need to consider $S_X$ and $S_Y$ whose profiles have a certain monotone
property.

Theorem 3. For all $n$, there exist $S_X, S_Y \subseteq \{0,1\}^n$ that
achieve $P'_n$ in Problem 2 such that for $a = \Gamma(S_X)$ and $b =
\Gamma(S_Y)$, $0.5 \ge a_1 \ge a_2 \ge \ldots \ge a_n \ge 0$ and $0 \le b_1 \le b_2 \le
\ldots \le b_n \le 0.5$.

By Theorem 3, it is sufficient for us to consider only profiles
$a \in [0,0.5]^m$. For each $m$-profile $a$, define its $n$-volume to be

$$V_n(a) = \max\{|S| : S \subseteq \{0,1\}^n,\ \Gamma(S) \le a\}, \tag{6}$$

where $n$ may not be the same as $m$.

Lemma 3. For any two profiles $p$ and $q$, if $p \le q$, we have
$V_n(p) \le V_n(q)$ for every positive integer $n$.

Proof: Notice that for any $n$, any $n$-profile no larger than
$p$ is also no larger than $q$, so the lemma follows. ■

The following theorem gives an upper bound on the volume
of a profile, which will be used in the proof of the lower bound
on $P'_n$.

Theorem 4. Fix an integer $m$ and let $a \in [0,0.5]^m$ be an
$m$-profile. For any positive integer $n$, the $n$-volume of profile
$a$ satisfies

$$V_n(a) \le 2^{\,n\left(\frac{1}{m}\sum_{i=1}^m h_b(a_i) + o(1)\right)}, \tag{7}$$

where $h_b$ is the binary entropy function defined in (3) and
$o(1) \to 0$ as $n \to \infty$.
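C. Converse and Achievability

Theorem 5. For any sequence of $S_X, S_Y \subseteq \{0,1\}^n$ such that

$$\frac{1}{n|S_X|\cdot|S_Y|}\sum_{\mathbf{x}\in S_X,\,\mathbf{y}\in S_Y} \mathbf{x}^T\mathbf{y} \le c_Q,$$

we have

$$\liminf_{n\to\infty} \frac{1}{\sqrt[n]{|S_X||S_Y|}} \ge 4^{-h_b(\sqrt{c_Q})}.$$

We then give a construction of $S_X$ and $S_Y$ to show that the
bound in Theorem 5 is tight.

Theorem 6. There exists a sequence of $S_X, S_Y \subseteq \{0,1\}^n$
such that

$$\frac{1}{n|S_X|\cdot|S_Y|}\sum_{\mathbf{x}\in S_X,\,\mathbf{y}\in S_Y} \mathbf{x}^T\mathbf{y} \le c_Q,$$

and

$$\lim_{n\to\infty} \frac{1}{\sqrt[n]{|S_X||S_Y|}} = 4^{-h_b(\sqrt{c_Q})}.$$

Now we are ready to prove Theorem 1.

Proof of Theorem 1: Theorem 5 implies that

$$\liminf_{n\to\infty} P'_n \ge 4^{-h_b(\sqrt{c_Q})},$$

and Theorem 6 implies that

$$\limsup_{n\to\infty} P'_n \le 4^{-h_b(\sqrt{c_Q})}.$$

Thus $\lim_{n\to\infty} P'_n = 4^{-h_b(\sqrt{c_Q})}$, which together with Theorem 2
proves Theorem 1. ■

III. Proofs

A. Proof of Proposition 1

The lower bound follows from $\max_{\mathbf{x}} p_X(\mathbf{x}) \ge 1/2^n$ for
any distribution $p_X$ over $\{0,1\}^n$. To prove the upper bound,
consider the following two distributions:

$$p_X(\mathbf{x}) = \begin{cases} 1-2c, & \mathbf{x} = \mathbf{0}; \\ 2c/(2^n-1), & \mathbf{x} \ne \mathbf{0}, \end{cases}$$

where $c \in (0,1/4]$ as given in Problem 1, and $p_Y(\mathbf{y}) = 1/2^n$
for all $\mathbf{y} \in \{0,1\}^n$. We then have

$$\frac{1}{n}\sum_{\mathbf{x},\mathbf{y}\in\{0,1\}^n} p_X(\mathbf{x})p_Y(\mathbf{y})\mathbf{x}^T\mathbf{y}
= \frac{2c}{2^n(2^n-1)n}\sum_{\mathbf{x}\ne\mathbf{0}}\ \sum_{\mathbf{y}\in\{0,1\}^n} \mathbf{x}^T\mathbf{y}
= \frac{2^{n-1}}{2^n-1}\,c \le c,$$

and

$$P_n \le \Big(\max_{\mathbf{x}} p_X(\mathbf{x})\max_{\mathbf{y}} p_Y(\mathbf{y})\Big)^{1/n}
= \frac{1}{2}\big(\max\{1-2c,\ 2c/(2^n-1)\}\big)^{1/n}
= \frac{1}{2}(1-2c)^{1/n} < \frac{1}{2},$$

where the second equality follows from $c \le 1/4$ and the last
inequality follows from $c > 0$.

B. Proof of Theorem 2

Suppose that $p_X$ and $p_Y$ on $\{0,1\}^n$ achieve the minimum
objective value $P_n$ in Problem 1. Write

$$\sum_{\mathbf{x},\mathbf{y}\in\{0,1\}^n} p_X(\mathbf{x})p_Y(\mathbf{y})\mathbf{x}^T\mathbf{y} = \sum_{\mathbf{x}} p_X(\mathbf{x})\,\theta_{p_Y}(\mathbf{x}),$$

where

$$\theta_{p_Y}(\mathbf{x}) = \mathbf{x}^T\Big(\sum_{\mathbf{y}} p_Y(\mathbf{y})\,\mathbf{y}\Big).$$

Let $P_X = \max_{\mathbf{x}} p_X(\mathbf{x})$. We know that $P_X > 0$. If $P_X = 1$,
then there exists $\mathbf{x}_0$ such that $p_X(\mathbf{x}_0) = 1$. In this case,
$P_n = 1/2$ since otherwise we may instead choose $p_X$ such
that $p_X(\mathbf{0}) = 1$ and $p_Y$ such that $p_Y(\mathbf{y}) = 1/2^n$ for all
$\mathbf{y} \in \{0,1\}^n$. Thus we have a contradiction to $P_n < 1/2$ (see
Proposition 1). Therefore, $0 < P_X < 1$.

Now consider the following linear program:

$$\begin{aligned}
\min \quad & \sum_{\mathbf{x}} p_X(\mathbf{x})\,\theta_{p_Y}(\mathbf{x}), \\
\text{s.t.} \quad & p_X(\mathbf{x}) \le P_X, \quad \forall \mathbf{x} \in \{0,1\}^n.
\end{aligned} \tag{8}$$

Let $p^*_X$ be an optimal distribution that minimizes the objective
of (8). Since the linear program must achieve its optimal value
at the extreme points, there must be $\lfloor 1/P_X \rfloor$ sequences $\mathbf{x}$ with
$p^*_X(\mathbf{x}) = P_X$ and one sequence $\mathbf{z}$ with $p^*_X(\mathbf{z}) = 1 - \lfloor 1/P_X \rfloor P_X$.
For any other sequence $\mathbf{x}$, we have $p^*_X(\mathbf{x}) = 0$.
We then have

$$\sum_{\mathbf{x}} p^*_X(\mathbf{x})\,\theta_{p_Y}(\mathbf{x}) \le \sum_{\mathbf{x}} p_X(\mathbf{x})\,\theta_{p_Y}(\mathbf{x}) \le nc,$$

and

$$\Big(\max_{\mathbf{x}} p^*_X(\mathbf{x})\max_{\mathbf{y}} p_Y(\mathbf{y})\Big)^{1/n} = \Big(P_X \cdot \max_{\mathbf{y}} p_Y(\mathbf{y})\Big)^{1/n} = P_n.$$

Therefore, $p^*_X$ and $p_Y$ also obtain the minimum objective value
$P_n$ in Problem 1.

Let $S_X$ be the support of $p^*_X$. We have $|S_X| = \lceil 1/P_X \rceil$, and
for any $\mathbf{x} \in S_X$, $\theta_{p_Y}(\mathbf{z}) \ge \theta_{p_Y}(\mathbf{x})$. Let $\tilde p_X$ be the uniform
distribution over $S_X\setminus\{\mathbf{z}\}$. Notice that for all $\mathbf{x} \in S_X\setminus\{\mathbf{z}\}$,

$$\tilde p_X(\mathbf{x}) \ge p^*_X(\mathbf{x}),$$

and

$$\sum_{\mathbf{x}\in S_X\setminus\{\mathbf{z}\}} \big(\tilde p_X(\mathbf{x}) - p^*_X(\mathbf{x})\big) = p^*_X(\mathbf{z}).$$

We have

$$\begin{aligned}
&\sum_{\mathbf{x},\mathbf{y}} \tilde p_X(\mathbf{x})p_Y(\mathbf{y})\mathbf{x}^T\mathbf{y} - \sum_{\mathbf{x},\mathbf{y}} p^*_X(\mathbf{x})p_Y(\mathbf{y})\mathbf{x}^T\mathbf{y} \\
&\quad= \sum_{\mathbf{x}\in S_X\setminus\{\mathbf{z}\}} \big(\tilde p_X(\mathbf{x}) - p^*_X(\mathbf{x})\big)\theta_{p_Y}(\mathbf{x}) - p^*_X(\mathbf{z})\,\theta_{p_Y}(\mathbf{z}) \\
&\quad\le \sum_{\mathbf{x}\in S_X\setminus\{\mathbf{z}\}} \big(\tilde p_X(\mathbf{x}) - p^*_X(\mathbf{x})\big)\theta_{p_Y}(\mathbf{z}) - p^*_X(\mathbf{z})\,\theta_{p_Y}(\mathbf{z}) \\
&\quad= 0.
\end{aligned}$$

Thus

$$\sum_{\mathbf{x},\mathbf{y}} \tilde p_X(\mathbf{x})p_Y(\mathbf{y})\mathbf{x}^T\mathbf{y} \le \sum_{\mathbf{x},\mathbf{y}} p^*_X(\mathbf{x})p_Y(\mathbf{y})\mathbf{x}^T\mathbf{y} \le nc. \tag{9}$$
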
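Let $P^*_n = \min_{p_X,p_Y} \big(\max_{\mathbf{x}} p_X(\mathbf{x})\max_{\mathbf{y}} p_Y(\mathbf{y})\big)^{1/n}$ such
that $p_X$ and $p_Y$ satisfy the constraint of Problem 1 and $p_X$ is
uniform over its support. We have

$$\begin{aligned}
P_n \le P^*_n &\le \Big(\max_{\mathbf{x}} \tilde p_X(\mathbf{x})\max_{\mathbf{y}} p_Y(\mathbf{y})\Big)^{1/n} \\
&= \Big(\frac{1}{\lfloor 1/P_X\rfloor}\max_{\mathbf{y}} p_Y(\mathbf{y})\Big)^{1/n} \\
&\le \Big(\frac{1}{\lceil 1/P_X\rceil - 1}\max_{\mathbf{y}} p_Y(\mathbf{y})\Big)^{1/n} \\
&\le \Big(3P_X\max_{\mathbf{y}} p_Y(\mathbf{y})\Big)^{1/n} \\
&= 3^{1/n}P_n,
\end{aligned}$$

where the second inequality follows from the fact that $\tilde p_X$ (the uniform
distribution over $S_X\setminus\{\mathbf{z}\}$) and $p_Y$ satisfy
the constraint of Problem 1 (see (9)), and the last inequality
follows from $0 < P_X < 1$ and Lemma 4 (to be proved later
in this section). Therefore, $\lim_{n\to\infty} P^*_n/P_n = 1$.

A similar technique can be used to show that
$\lim_{n\to\infty} P'_n/P^*_n = 1$, which completes the proof of this
theorem. Specifically, suppose that $p_X, p_Y$ on $\{0,1\}^n$
achieve $P^*_n$ where $p_X$ is uniform on its support. Define
$P_Y = \max_{\mathbf{y}} p_Y(\mathbf{y})$ and $P_X = \max_{\mathbf{x}} p_X(\mathbf{x})$. Similar to the
above argument, there exists a distribution $p^*_Y$ such that

1) for $\lfloor 1/P_Y \rfloor$ sequences $\mathbf{y}$, $p^*_Y(\mathbf{y}) = P_Y$, for one other
sequence $\mathbf{y}_0$, $p^*_Y(\mathbf{y}_0) = 1 - \lfloor 1/P_Y \rfloor P_Y$, and for all other
sequences $\mathbf{y}$, $p^*_Y(\mathbf{y}) = 0$;

2) $\sum_{\mathbf{x},\mathbf{y}} p_X(\mathbf{x})p^*_Y(\mathbf{y})\mathbf{x}^T\mathbf{y} \le \sum_{\mathbf{x},\mathbf{y}} p_X(\mathbf{x})p_Y(\mathbf{y})\mathbf{x}^T\mathbf{y} \le nc$; and

3) $\big(\max_{\mathbf{x}} p_X(\mathbf{x})\max_{\mathbf{y}} p^*_Y(\mathbf{y})\big)^{1/n} = (P_X P_Y)^{1/n}$.

Let the support set of distribution $p^*_Y$ be $S_Y$, and let $\tilde p_Y$
be the uniform distribution over $S_Y\setminus\{\mathbf{y}_0\}$. Similar to the
reasoning of (9), we have

$$\sum_{\mathbf{x},\mathbf{y}} p_X(\mathbf{x})\tilde p_Y(\mathbf{y})\mathbf{x}^T\mathbf{y} \le \sum_{\mathbf{x},\mathbf{y}} p_X(\mathbf{x})p^*_Y(\mathbf{y})\mathbf{x}^T\mathbf{y} \le nc.$$

Again, according to Lemma 4,

$$\begin{aligned}
P^*_n \le P'_n &\le \Big(P_X\max_{\mathbf{y}} \tilde p_Y(\mathbf{y})\Big)^{1/n} \\
&= \Big(P_X\,\frac{1}{\lfloor 1/P_Y\rfloor}\Big)^{1/n} \\
&\le \Big(P_X\,\frac{1}{\lceil 1/P_Y\rceil - 1}\Big)^{1/n} \\
&\le (3P_XP_Y)^{1/n} \\
&= \sqrt[n]{3}\,P^*_n,
\end{aligned}$$

and hence $\lim_{n\to\infty} P'_n/P^*_n = 1$. ■

Lemma 4. For every $x \in (0,1)$,

$$x\big(\lceil 1/x \rceil - 1\big) \ge \frac{1}{3}.$$

Proof: If $x \ge \frac{1}{3}$, then

$$x\big(\lceil 1/x \rceil - 1\big) \ge x \ge \frac{1}{3}.$$

If $x < \frac{1}{3}$, then

$$x\big(\lceil 1/x \rceil - 1\big) \ge x(1/x - 2) = 1 - 2x > \frac{1}{3}.$$ ■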
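
Lemma 4 is the step that turns the factor $1/(\lceil 1/P_X\rceil - 1)$ into at most $3P_X$. A small exact check over a rational grid is sketched below; it uses `fractions.Fraction` so that the ceiling is evaluated without floating-point rounding. The function name `lemma4_value` and the grid resolution are illustrative choices.

```python
import math
from fractions import Fraction

def lemma4_value(x):
    """Evaluate x * (ceil(1/x) - 1) exactly for a rational x in (0, 1)."""
    return x * (math.ceil(1 / x) - 1)

# All multiples of 1/1000 strictly between 0 and 1.
grid = [Fraction(k, 1000) for k in range(1, 1000)]
values = [lemma4_value(x) for x in grid]
smallest = min(values)
```

On this grid the smallest value is 1/2 (attained at $x = 1/2$), so the constant 1/3 in the lemma is comfortable; any constant works for the proof, since it only contributes a factor of the form $3^{1/n}$ that tends to 1.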


C. Proof of Theorem 3

We first show that we only need to consider $S_X$ and $S_Y$
with profiles $a, b \in [0,0.5]^n$. Suppose that for some $i$ we have
$a_i > \frac{1}{2}$. We obtain a new set $S'_X$ by flipping the $i$-th bit of
all vectors in $S_X$. Let $a' = \Gamma(S'_X)$. We have $a'_k = a_k$ for
$k \ne i$ and $a'_i = 1 - a_i$. We know from Lemma 2 that
the constraint (4) still holds with $S'_X$ in place of $S_X$ since
$a'_i < 0.5 < a_i$, while the objective function of Problem 2
with $S'_X$ in place of $S_X$ does not change since $|S'_X| = |S_X|$.
Similarly, we can modify $S_Y$ such that all $b_i \le \frac{1}{2}$.

Without loss of generality, we assume $a_1 \ge a_2 \ge
\cdots \ge a_n$. Otherwise we just change the order of the bits in the
strings. Now we put $b_1, \ldots, b_n$ in non-decreasing order
as $b'_1 \le \ldots \le b'_n$. There must exist a set $S'_Y \subseteq \{0,1\}^n$ such
that $\Gamma(S'_Y) = (b'_1, \ldots, b'_n)^T$, obtained by changing the order of the bits
for each string in set $S_Y$. Then we have

$$\frac{1}{n|S_X||S'_Y|}\sum_{\mathbf{x}\in S_X,\,\mathbf{y}\in S'_Y} \mathbf{x}^T\mathbf{y} = \frac{1}{n}\sum_{i=1}^n a_i b'_i \le \frac{1}{n}\sum_{i=1}^n a_i b_i \le c. \tag{10}$$

The proof is completed by $|S_X||S'_Y| = |S_X||S_Y|$. ■
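
The middle inequality in (10) is the rearrangement inequality: pairing a non-increasing sequence with a non-decreasing one minimizes the inner product over all reorderings. The brute-force sketch below confirms this on a small example; the vectors and function names are arbitrary illustrative choices.

```python
import itertools

def min_inner_product(a, b):
    """Smallest value of sum_i a_i * b_perm(i) over all permutations of b."""
    return min(
        sum(ai * bi for ai, bi in zip(a, perm))
        for perm in itertools.permutations(b)
    )

def sorted_inner_product(a, b):
    """Inner product after sorting a in non-increasing and b in non-decreasing order."""
    a_dec = sorted(a, reverse=True)
    b_inc = sorted(b)
    return sum(ai * bi for ai, bi in zip(a_dec, b_inc))

# Example profiles with entries in [0, 0.5].
a = [0.5, 0.1, 0.3, 0.4]
b = [0.2, 0.45, 0.05, 0.3]
best = min_inner_product(sorted(a, reverse=True), b)
monotone = sorted_inner_product(a, b)
```

Since reordering bit positions changes neither $|S_X|$ nor $|S_Y|$, the objective of Problem 2 is untouched while the constraint can only become easier to satisfy.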


D. Proof of Theorem 4

The logarithm in this proof has base 2. Consider a subset
$S \subseteq \{0,1\}^n$ with $\Gamma(S) \le a$. Define a random vector $X =
(X_1, X_2, \ldots, X_n)$ over $\{0,1\}^n$ with support $S$ and $\Pr\{X =
\mathbf{x}\} = \frac{1}{|S|}$ for each $\mathbf{x} \in S$. Recall that the $i$-th component
of $\mathbf{x} \in \{0,1\}^n$ is denoted by $x_i$. Let $l_k = \lfloor kn/m \rfloor$ for $k =
0,1,\ldots,m$. Since $(E[X_1], E[X_2], \ldots, E[X_n]) = \Gamma(S) \le a$, we
have for $k = 1,\ldots,m$ and $i = 1,\ldots,l_k - l_{k-1}$,

$$E[X_{l_{k-1}+i}] = f_{\Gamma(S)}\Big(\frac{l_{k-1}+i}{n}\Big) \le f_a\Big(\frac{l_{k-1}+i}{n}\Big) = a_k.$$

Note that $X_i$ is a binary
random variable. Hence the entropy $H(X_{l_{k-1}+i}) \le h_b(a_k)$
for $k = 1,\ldots,m$ and $i = 1,\ldots,l_k - l_{k-1}$. Therefore,

$$\begin{aligned}
\log|S| = H(X) &\le \sum_{k=1}^m \sum_{i=1}^{l_k-l_{k-1}} H(X_{l_{k-1}+i}) \\
&\le \sum_{k=1}^m (l_k - l_{k-1})\,h_b(a_k) \\
&\le n\Big(\frac{1}{m}\sum_{k=1}^m h_b(a_k) + o(1)\Big),
\end{aligned}$$

where the last inequality follows from $l_k - l_{k-1} \le \frac{n}{m} + 1$ and
$o(1)$ tends to zero as $n$ tends to $\infty$. Since the above inequality
holds for every subset $S \subseteq \{0,1\}^n$ with $\Gamma(S) \le a$, we have

$$V_n(a) \le 2^{\,n\left(\frac{1}{m}\sum_{i=1}^m h_b(a_i) + o(1)\right)}.$$ ■
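
The core of this proof is the subadditivity step $\log|S| = H(X) \le \sum_i H(X_i) = \sum_i h_b(E[X_i])$, which holds exactly for every nonempty $S$, before any grouping into $m$ blocks. The sketch below checks that inequality on random subsets of a small cube; the helper names and the choice $n = 6$ are illustrative assumptions.

```python
import itertools
import math
import random

def hb(t):
    """Binary entropy in bits, with the convention hb(0) = hb(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

def profile(S, n):
    """Coordinate-wise mean of the binary n-tuples in S."""
    return [sum(x[i] for x in S) / len(S) for i in range(n)]

def entropy_bound_gap(S, n):
    """sum_i hb(profile_i) - log2|S|; nonnegative by subadditivity of entropy."""
    return sum(hb(g) for g in profile(S, n)) - math.log2(len(S))

n = 6
cube = list(itertools.product((0, 1), repeat=n))
rng = random.Random(1)
gaps = [entropy_bound_gap(rng.sample(cube, rng.randint(1, len(cube))), n)
        for _ in range(200)]
full_cube_gap = entropy_bound_gap(cube, n)
```

The gap is zero for the full cube (all coordinates independent and uniform) and positive whenever the coordinates of a uniformly chosen element of $S$ are dependent.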
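E. Proof of Theorem 5

Let $a = \Gamma(S_X)$, $b = \Gamma(S_Y)$. By Theorem 3, it is sufficient
for us to consider $S_X$ and $S_Y$ such that $0.5 \ge a_1 \ge \ldots \ge
a_n \ge 0$ and $0 \le b_1 \le \ldots \le b_n \le 0.5$. Hence $f_a$ is decreasing
on $[0,1]$, and $f_b$ is increasing on $[0,1]$.

Define two $m$-profiles $\bar a$ and $\underline a$ such that for $1 \le i \le m$,

$$\bar a_i = \frac{\lceil m f_a(\frac{i-1}{m}) \rceil}{m}, \qquad \underline a_i = \frac{\lfloor m f_a(\frac{i}{m}) \rfloor}{m}.$$

The functions $f_{\bar a}$ and $f_{\underline a}$ are decreasing on $[0,1]$.

Lemma 5. $\underline a \le a \le \bar a$.

Proof: Notice that $f_a$ is a decreasing function. For every
$0 < r \le 1$,

$$f_{\bar a}(r) = \bar a_{\lceil rm \rceil} \ge f_a\Big(\frac{\lceil rm \rceil - 1}{m}\Big) \ge f_a(r),$$

and similarly,

$$f_{\underline a}(r) = \underline a_{\lceil rm \rceil} \le f_a\Big(\frac{\lceil rm \rceil}{m}\Big) \le f_a(r).$$

Thus $\underline a \le a \le \bar a$. ■

Define two $m$-profiles $\bar b$ and $\underline b$ such that for $1 \le i \le m$,

$$\bar b_i = \frac{\lceil m f_b(\frac{i}{m}) \rceil}{m}, \qquad \underline b_i = \frac{\lfloor m f_b(\frac{i-1}{m}) \rfloor}{m}.$$

The functions $f_{\bar b}$ and $f_{\underline b}$ are increasing on $[0,1]$, and similar to
Lemma 5 we have the following lemma.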

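Lemma 6. $\underline b \le b \le \bar b$.

Now we can prove the following lemma.

Lemma 7. For $m \ge 2$,

$$\frac{1}{m}\sum_{i=1}^m \bar a_i \bar b_i - \frac{1}{n}\sum_{i=1}^n a_i b_i \le \frac{2}{m}. \tag{11}$$

Proof: Observe that

$$\begin{aligned}
\frac{1}{m}\sum_{i=1}^m \bar a_i \bar b_i - \frac{1}{n}\sum_{i=1}^n a_i b_i
&= \frac{1}{m}\sum_{i=1}^m \bar a_i \bar b_i - \int_0^1 f_a(t)f_b(t)\,dt \\
&\le \frac{1}{m}\sum_{i=1}^m \bar a_i \bar b_i - \int_0^1 f_{\underline a}(t)f_{\underline b}(t)\,dt \\
&= \frac{1}{m}\sum_{i=1}^m \bar a_i \bar b_i - \frac{1}{m}\sum_{i=1}^m \underline a_i \underline b_i,
\end{aligned}$$

where in the first and last equalities we apply Lemma 1 (recall that the
characteristic functions are step functions), and the inequality follows
from Lemmas 5 and 6.
By definition, we have for $1 \le i \le m-1$, $m\underline a_i \ge m\bar a_{i+1} - 1$
and $m\underline b_{i+1} \ge m\bar b_i - 1$. Hence

$$\begin{aligned}
\frac{1}{m}\sum_{i=1}^m \bar a_i \bar b_i - \frac{1}{m}\sum_{i=1}^m \underline a_i \underline b_i
&\le \frac{1}{m}\sum_{i=1}^m \bar a_i \bar b_i - \frac{1}{m}\sum_{i=2}^{m-1} \underline a_i \underline b_i \\
&\le \frac{1}{m}\bigg(\bar a_1\bar b_1 + \bar a_2\bar b_2 + \sum_{i=3}^m \bar a_i(\bar b_i - \bar b_{i-2}) + \sum_{i=2}^{m-1}\Big(\frac{\bar a_{i+1}}{m} + \frac{\bar b_{i-1}}{m}\Big)\bigg) \\
&\le \frac{1}{m}\bigg(0.25 + 0.25 + \sum_{i=3}^m 0.5(\bar b_i - \bar b_{i-2}) + \sum_{i=2}^{m-1}\Big(\frac{0.5}{m} + \frac{0.5}{m}\Big)\bigg) \\
&\le \frac{1.5 + 0.5\bar b_m + 0.5\bar b_{m-1} - 0.5\bar b_2 - 0.5\bar b_1}{m} \\
&\le \frac{2}{m},
\end{aligned}$$

where we use the fact that $\bar a_i, \bar b_i \le 0.5$. ■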

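By Lemma 7 and the condition of the theorem (using the
form given in Lemma 2), we have

$$\frac{1}{m}\sum_{i=1}^m \bar a_i \bar b_i \le \frac{1}{n}\sum_{i=1}^n a_i b_i + \frac{2}{m} \le c_Q + \frac{2}{m}. \tag{12}$$

From Lemma 3 and Theorem 4 we know that

$$\begin{aligned}
|S_X||S_Y| &\le V_n(a)V_n(b) \\
&\le V_n(\bar a)V_n(\bar b) \\
&\le 2^{\,n\left(\frac{1}{m}\sum_{i=1}^m (h_b(\bar a_i) + h_b(\bar b_i)) + o(1)\right)},
\end{aligned}$$

where $o(1) \to 0$ as $n \to \infty$. For $0 \le t \le 0.25$, define

$$f(t) = \max_{2t \le x \le \frac{1}{2}} \Big(h_b(x) + h_b\Big(\frac{t}{x}\Big)\Big). \tag{13}$$

Some properties of the above function are given in Appendix II
(see Lemmas 8-10). We have

$$\begin{aligned}
\frac{1}{m}\sum_{i=1}^m \big(h_b(\bar a_i) + h_b(\bar b_i)\big)
&= \frac{1}{m}\sum_{i=1}^m \Big(h_b(\bar a_i) + h_b\Big(\frac{\bar a_i \bar b_i}{\bar a_i}\Big)\Big) \\
&\le \frac{1}{m}\sum_{i=1}^m f(\bar a_i \bar b_i) \le f\Big(c_Q + \frac{2}{m}\Big),
\end{aligned}$$

where the first inequality follows from the definition of $f(\bar a_i \bar b_i)$,
and the second inequality is obtained by applying (12) and
Lemma 10.

Thus for any sufficiently large $m$,

$$\liminf_{n\to\infty} \frac{1}{\sqrt[n]{|S_X||S_Y|}} \ge 2^{-f(c_Q + \frac{2}{m})}.$$

Taking $m \to \infty$, we have

$$\liminf_{n\to\infty} \frac{1}{\sqrt[n]{|S_X||S_Y|}} \ge 2^{-f(c_Q)} = 4^{-h_b(\sqrt{c_Q})}, \tag{14}$$

where the last equality is implied by Lemma 8. ■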
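F. Proof of Theorem 6

For every $n$, let

$$S_X = S_Y = \{\mathbf{x} \in \{0,1\}^n : \mathbf{x} \text{ includes at most } n\sqrt{c_Q} \text{ 1s}\}.$$

Then

$$|S_X| = |S_Y| = \sum_{i=0}^{\lfloor n\sqrt{c_Q} \rfloor} \binom{n}{i} = 2^{\,n(h_b(\sqrt{c_Q}) + o(1))}, \tag{15}$$

where $o(1) \to 0$ as $n \to \infty$. Thus

$$\lim_{n\to\infty} \frac{1}{\sqrt[n]{|S_X||S_Y|}} = \frac{1}{2^{2h_b(\sqrt{c_Q})}} = 4^{-h_b(\sqrt{c_Q})}.$$

From the constructions of $S_X$ and $S_Y$, we know that

$$\Gamma(S_X) = \Gamma(S_Y) \le (\sqrt{c_Q}, \sqrt{c_Q}, \ldots, \sqrt{c_Q}).$$

Therefore

$$\frac{1}{n|S_X||S_Y|}\sum_{\mathbf{x}\in S_X,\,\mathbf{y}\in S_Y} \mathbf{x}^T\mathbf{y} = \frac{1}{n}\big(\Gamma(S_X)\big)^T\Gamma(S_Y) \le \frac{1}{n}\sum_{i=1}^n \sqrt{c_Q}\sqrt{c_Q} = c_Q.$$

Thus $S_X$ and $S_Y$ satisfy the constraint in Theorem 6. ■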
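
The type-counting construction can be evaluated exactly for moderate $n$ with integer binomial coefficients. The sketch below computes, for the set of sequences with at most $\lfloor n\sqrt{c}\rfloor$ ones, the constraint value (using the fact that by symmetry every coordinate of the profile equals the average fraction of ones) and the objective $1/\sqrt[n]{|S_X||S_Y|}$. The function name `low_weight_set_stats` and the sample sizes are illustrative assumptions.

```python
import math

def hb(t):
    """Binary entropy in bits, with the convention hb(0) = hb(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

def low_weight_set_stats(n, c):
    """Constraint value and objective for S_X = S_Y = {x : weight(x) <= n*sqrt(c)}."""
    k = math.floor(n * math.sqrt(c))
    size = sum(math.comb(n, i) for i in range(k + 1))       # |S_X| = |S_Y|
    ones = sum(i * math.comb(n, i) for i in range(k + 1))   # total number of 1s in the set
    density = ones / (n * size)                             # each profile coordinate
    constraint = density ** 2                               # (1/n) * Gamma^T Gamma
    objective = 2 ** (-2 * math.log2(size) / n)             # (|S_X||S_Y|)^(-1/n)
    return constraint, objective

c_Q = (4 - 2 * math.sqrt(2)) / 8
limit = 4 ** (-hb(math.sqrt(c_Q)))
constraint_200, objective_200 = low_weight_set_stats(200, c_Q)
constraint_2000, objective_2000 = low_weight_set_stats(2000, c_Q)
```

The objective approaches the limit from above as $n$ grows, which is the behaviour expected from the $o(1)$ term in (15); the logarithm is taken on the exact integer `size` to avoid overflow for large $n$.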


IV. Concluding Remarks

In this paper, we determine for Problem 1 that when $c = c_Q$,

$$\lim_{n\to\infty} P_n = 4^{-h_b(\sqrt{c})}, \tag{16}$$

which is of particular interest for quantum information. Note
that our technique also shows that (16) holds for $c_Q < c \le
1/4$. However, the existing technique in this paper does not
imply (16) for $c < c_Q$, which would hold if we could show that
$f(t)$ (defined in (13)) is concave in $[0,0.25]$. But we can only
show the concavity of $f(t)$ for the range $[0.0625, 0.25]$ (see
Appendix II). Whether $f(t)$ is concave in $[0,0.25]$ is of certain
mathematical interest.
Appendix I

Background of the Optimization Problem


A. CHSH Inequality

A Bell test experiment has two spatially separated parties,
Alice and Bob, who can randomly choose their device settings
$X$ and $Y$ from the set $\{0,1\}$ and generate random output
bits $A$ and $B$, respectively. The Clauser-Horne-Shimony-Holt
(CHSH) inequality is that

$$S := \sum_{a,b,x,y\in\{0,1\}} (-1)^{a\oplus b+xy}\, q_{AB|XY}(a,b|x,y) \le 2, \tag{17}$$

where $\oplus$ denotes the exclusive-or of two bits, and
$q_{AB|XY}(a,b|x,y)$ is the probability that outputs $a$ and $b$ are
generated when the device settings are $x$ and $y$. To simplify the
notation, we may also write $q_{AB|XY}(a,b|x,y)$ as $q(a,b|x,y)$,
and use a similar convention for other probability distributions.
The theory of quantum mechanics predicts a maximum
value for $S$ of $S_Q = 2\sqrt{2}$.
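
The classical bound in (17) can be seen by enumerating deterministic local strategies, where Alice outputs $a(x)$ and Bob outputs $b(y)$ so that $q(a,b|x,y)$ is 1 on a single output pair per setting pair. The sketch below evaluates the sum in (17) for all 16 such strategies; the function name `chsh_value` is an illustrative choice.

```python
import itertools

def chsh_value(a, b):
    """CHSH sum (17) for deterministic outputs a = (a(0), a(1)), b = (b(0), b(1))."""
    return sum(
        (-1) ** ((a[x] ^ b[y]) + x * y)
        for x in (0, 1) for y in (0, 1)
    )

# All 4 x 4 = 16 pairs of deterministic output functions.
strategies = list(itertools.product(itertools.product((0, 1), repeat=2), repeat=2))
all_values = [chsh_value(a, b) for a, b in strategies]
classical_max = max(all_values)
```

Every deterministic strategy gives $+2$ or $-2$, and any LHVM with freely chosen settings is a mixture of such strategies, so it cannot exceed 2, whereas the quantum value is $2\sqrt{2}$.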

In a local hidden variable model (LHVM), assume that an
adversary Eve controls a variable $\lambda$ taking discrete values so
that

$$q(a,b|x,y) = \sum_{\lambda} q(a|x,\lambda)\,q(b|y,\lambda)\,q(\lambda|x,y),$$

where $q(a|x,\lambda)$ (resp. $q(b|y,\lambda)$) is the probability that $a$
(resp. $b$) is output when the setting of Alice (resp. Bob) is $x$ (resp. $y$),
and $q(\lambda|x,y)$ is the conditional probability distribution of the
variable $\lambda$ given $x$ and $y$. Free will is assumed in the derivation
of the CHSH inequality, i.e.,

$$q(\lambda|x,y) = q(\lambda). \tag{18}$$

With this assumption, the inequality (17) holds for any LHVM.

We consider the case that the device settings may not be
chosen freely, i.e., (18) may not hold. By Bayes' law,

$$q(\lambda|x,y) = \frac{q(x,y|\lambda)\,q(\lambda)}{q(x,y)} = 4\,q(x,y|\lambda)\,q(\lambda),$$

where $q(x,y)$ is assumed to be $1/4$ so that Alice and Bob
cannot detect the existence of adversary Eve. In this case,

$$S = \sum_{\lambda} S_\lambda\, q(\lambda), \tag{19}$$

$$S_\lambda = 4\sum_{a,b,x,y\in\{0,1\}} (-1)^{a\oplus b+xy}\, q(a|x,\lambda)\,q(b|y,\lambda)\,q(x,y|\lambda).$$

The adversary can pick probabilities $q(\lambda)$, $q(x,y|\lambda)$, $q(a|x,\lambda)$
and $q(b|y,\lambda)$ to fake the violation of a Bell's inequality.

The following randomness measure is used in the literature
os, m. os:

$$P = \max_{x,y,\lambda} q(x,y|\lambda).$$

Note that $P$ takes values from $1/4$ to $1$. When $P = 1/4$,
all the device settings are uniformly picked independent of $\lambda$.
When $P = 1$, for at least one value of $\lambda$, the device settings
are deterministic.

We are interested in the minimum value of $P$ such that
$S \ge S_Q$ for certain LHVMs in the independent device setting
scenario, i.e., $q(x,y|\lambda) = q(x|\lambda)q(y|\lambda)$. In other words, we
want to solve the following problem

$$\begin{aligned}
\min \quad & \max_{x,y,\lambda} q(x,y|\lambda) \\
\text{s.t.} \quad & \sum_{\lambda} S_\lambda\, q(\lambda) \ge S_Q, \\
& \sum_{\lambda} q(x,y|\lambda)\,q(\lambda) = \frac{1}{4}, \\
& q(x,y|\lambda) = q(x|\lambda)\,q(y|\lambda),
\end{aligned} \tag{20}$$

where the minimization is over all the possible (conditional)
distributions $q(\lambda)$, $q(a|x,\lambda)$, $q(b|y,\lambda)$ and $q(x,y|\lambda)$ with
$q(x,y|\lambda) = q(x|\lambda)q(y|\lambda)$. Due to the convexity of the constraints
with respect to $q(a|x,\lambda)$ and $q(b|y,\lambda)$, we can consider
only deterministic distributions $q(a|x,\lambda)$ and $q(b|y,\lambda)$ without
changing the optimal value of (20). Let $a = a(x,\lambda)$ and
$b = b(y,\lambda)$. Rewrite

$$S_\lambda = 4\sum_{x,y\in\{0,1\}} (-1)^{a(x,\lambda)\oplus b(y,\lambda)+xy}\, q(x,y|\lambda). \tag{21}$$

In the above formulations, only a single run of the test
is performed. It is more realistic to consider that the device
settings in different runs are correlated, which is referred
to as the multiple-run scenario, where the device settings
$\mathbf{x} = (x_1,\ldots,x_n)^T$ and $\mathbf{y} = (y_1,\ldots,y_n)^T$ in $n$ runs of
the test follow a joint distribution $q(\mathbf{x},\mathbf{y}|\lambda)$. Similar to the
discussion of the single-run scenario, for multiple runs, we
have the CHSH inequality $\sum_\lambda S^{(n)}_\lambda q(\lambda) \le 2$ with

$$\begin{aligned}
S^{(n)}_\lambda &= \frac{4}{n}\sum_{\mathbf{x},\mathbf{y}\in\{0,1\}^n} q(\mathbf{x},\mathbf{y}|\lambda)\sum_{i=1}^n (-1)^{a(x_i,\lambda)\oplus b(y_i,\lambda)+x_iy_i} \\
&= 4\sum_{\mathbf{x},\mathbf{y}\in\{0,1\}^n} q(\mathbf{x},\mathbf{y}|\lambda)\Big[\pi(0,0|\mathbf{x},\mathbf{y})(-1)^{a(0,\lambda)\oplus b(0,\lambda)} + \pi(0,1|\mathbf{x},\mathbf{y})(-1)^{a(0,\lambda)\oplus b(1,\lambda)} \\
&\qquad\qquad + \pi(1,0|\mathbf{x},\mathbf{y})(-1)^{a(1,\lambda)\oplus b(0,\lambda)} + \pi(1,1|\mathbf{x},\mathbf{y})(-1)^{a(1,\lambda)\oplus b(1,\lambda)+1}\Big] \\
&= 4\sum_{x,y\in\{0,1\}} (-1)^{a(x,\lambda)\oplus b(y,\lambda)+xy}\,\pi(x,y|\lambda),
\end{aligned} \tag{22}$$

where $\pi(x,y|\mathbf{x},\mathbf{y})$ is the fraction of $(x,y)$ pairs among the
pairs $(x_k,y_k)$, $k = 1,\ldots,n$, and

$$\pi(x,y|\lambda) = \sum_{\mathbf{x},\mathbf{y}\in\{0,1\}^n} q(\mathbf{x},\mathbf{y}|\lambda)\,\pi(x,y|\mathbf{x},\mathbf{y}).$$

Note that (22) shares the same form as (21).

Define the measure of measurement dependence for multiple
runs as

$$P^{(n)} = \Big(\max_{\mathbf{x},\mathbf{y},\lambda} q(\mathbf{x},\mathbf{y}|\lambda)\Big)^{1/n}.$$

Under the independent device setting condition that
$q(\mathbf{x},\mathbf{y}|\lambda) = q(\mathbf{x}|\lambda)q(\mathbf{y}|\lambda)$, the problem of interest now
becomes

$$\begin{aligned}
\min \quad & \Big(\max_{\mathbf{x},\mathbf{y},\lambda} q(\mathbf{x},\mathbf{y}|\lambda)\Big)^{1/n} \\
\text{s.t.} \quad & \sum_{\lambda} S^{(n)}_\lambda\, q(\lambda) \ge S_Q, \\
& \sum_{\lambda} q(\mathbf{x},\mathbf{y}|\lambda)\,q(\lambda) = \frac{1}{4^n}, \\
& q(\mathbf{x},\mathbf{y}|\lambda) = q(\mathbf{x}|\lambda)\,q(\mathbf{y}|\lambda),
\end{aligned} \tag{23}$$

where $S^{(n)}_\lambda$ is defined in (22). Note that when $n = 1$, (23)
becomes (20).

B. Simplification

We use the case $n = 1$ to illustrate how to simplify the
above optimization problem.

First, we determine the choice of the output functions
$a(x,\lambda)$ and $b(y,\lambda)$ using the approach in [18]. For a given value
of $\lambda$, there are in total 16 different pairs of output functions
$(a,b)$. Table II lists the eight possible output functions with
$a(0,\lambda) = 0$. It is not necessary to consider the other eight
possible output functions with $a(0,\lambda) = 1$ since they give the
same set of $S_\lambda$ as listed in the last column of Table II. Since
the output functions with index 1, 2, 3, 4 are better than the
output functions with index 5, 6, 7, 8, respectively, we use the
former four choices of the output functions.

TABLE II

Output function assignment.

index  a(0,λ)  a(1,λ)  b(0,λ)  b(1,λ)  S_λ/4
1      0       0       0       0        q(0,0|λ) + q(0,1|λ) + q(1,0|λ) - q(1,1|λ)
2      0       0       0       1        q(0,0|λ) - q(0,1|λ) + q(1,0|λ) + q(1,1|λ)
3      0       1       0       0        q(0,0|λ) + q(0,1|λ) - q(1,0|λ) + q(1,1|λ)
4      0       1       1       0       -q(0,0|λ) + q(0,1|λ) + q(1,0|λ) + q(1,1|λ)
5      0       0       1       0       -q(0,0|λ) + q(0,1|λ) - q(1,0|λ) - q(1,1|λ)
6      0       0       1       1       -q(0,0|λ) - q(0,1|λ) - q(1,0|λ) + q(1,1|λ)
7      0       1       0       1        q(0,0|λ) - q(0,1|λ) - q(1,0|λ) - q(1,1|λ)
8      0       1       1       1       -q(0,0|λ) - q(0,1|λ) + q(1,0|λ) - q(1,1|λ)

With the choices of the output functions as specified above,
the constraint $\sum_\lambda q(x,y|\lambda)q(\lambda) = \frac{1}{4}$ is redundant. To show
this, we consider a LHVM (denoted by $L^*$) with a constant
$\lambda$, and output functions $a^*(x) = b^*(y) = 0$. (Other choices of
$a^*(x)$ and $b^*(y)$ can be shown similarly.) We use $q^*(x,y)$ to
denote the device setting distribution related to this LHVM.
Define a new LHVM (denoted by $L$) with $\lambda = 0,1,2,3$ and
$q(\lambda) = 1/4$ as follows: The output functions are assigned
according to Table III, and the device setting distributions are
assigned according to Table IV.

TABLE III

Output function assignment.

λ   a(0,λ)  a(1,λ)  b(0,λ)  b(1,λ)
0   0       0       0       0
1   0       0       0       1
2   0       1       0       0
3   0       1       1       0

TABLE IV

Assignment of the device setting distributions.

λ   q(0,0|λ)   q(0,1|λ)   q(1,0|λ)   q(1,1|λ)
0   q*(0,0)    q*(0,1)    q*(1,0)    q*(1,1)
1   q*(1,0)    q*(1,1)    q*(0,0)    q*(0,1)
2   q*(0,1)    q*(0,0)    q*(1,1)    q*(1,0)
3   q*(1,1)    q*(1,0)    q*(0,1)    q*(0,0)

It can be verified that

$$P = \max_{x,y\in\{0,1\},\,\lambda\in\{0,1,2,3\}} q(x,y|\lambda) = \max_{x,y\in\{0,1\}} q^*(x,y),$$

and

$$\begin{aligned}
S &= \sum_{\lambda\in\{0,1,2,3\}} q(\lambda)\,4\sum_{x,y\in\{0,1\}} (-1)^{a(x,\lambda)\oplus b(y,\lambda)+xy}\,q(x,y|\lambda) \\
&= 4\big[q^*(0,0) + q^*(0,1) + q^*(1,0) - q^*(1,1)\big].
\end{aligned}$$

Hence, if LHVM $L^*$ achieves the optimal value of (20), so
does LHVM $L$, which has $q(x,y) = 1/4$.

Further, for each of the four pairs of output functions with
index 1, 2, 3, 4 in Table II, the corresponding $S_\lambda$ involves
only one summand with a negative coefficient. Since the four
probability masses $q(0,0|\lambda)$, $q(0,1|\lambda)$, $q(1,0|\lambda)$ and $q(1,1|\lambda)$
are symmetric, these four pairs of output functions achieve the
same optimal value. Here we use $a(x,\lambda) = b(y,\lambda) = 0$ so
that

$$\sum_{\lambda} S_\lambda\, q(\lambda) = 4 - 8\,q_{XY}(1,1), \qquad q_{XY}(1,1) = \sum_\lambda q(1,1|\lambda)\,q(\lambda).$$

With these simplifications, the above minimization problem
becomes

$$\begin{aligned}
\min \quad & \max_{x,y,\lambda} q(x,y|\lambda) \\
\text{s.t.} \quad & \sum_{\lambda} q(1,1|\lambda)\,q(\lambda) \le \frac{4-S_Q}{8}, \\
& q(x,y|\lambda) = q(x|\lambda)\,q(y|\lambda).
\end{aligned} \tag{24}$$

For any $\lambda$ and $c \in [0,0.5]$, let $P(c)$ be the minimum
value of $\max_{x,y} q(x,y|\lambda)$ such that $q(1,1|\lambda) \le c$ and $q(x,y|\lambda) =
q(x|\lambda)q(y|\lambda)$. Note that $P(c)$ does not depend on the choice
of $\lambda$, and $P(c)$ is a non-increasing function of $c$. It is clear
that if we use only a constant $\lambda$ in (24), the optimal value
is $P(\frac{4-S_Q}{8})$. Now we show that it is sufficient to consider a
constant $\lambda$. Suppose that $q^*(x,y|\lambda)$ and $q^*(\lambda)$ achieve the optimal value
of (24). Let $c_\lambda = q^*(1,1|\lambda)$. By the first constraint of (24),
we have $\sum_\lambda q^*(\lambda)c_\lambda \le \frac{4-S_Q}{8}$, which implies the existence of
a certain $\lambda^*$ such that $c_{\lambda^*} \le \frac{4-S_Q}{8}$. By the definition of $P(c)$,
we have

$$\max_{x,y} q^*(x,y|\lambda) \ge P(c_\lambda),$$

which implies

$$\max_{\lambda,x,y} q^*(x,y|\lambda) \ge \max_{\lambda} P(c_\lambda) \ge P(c_{\lambda^*}) \ge P\Big(\frac{4-S_Q}{8}\Big).$$

In other words, using a LHVM with $\lambda$ taking multiple values
cannot achieve a smaller optimal value than $P(\frac{4-S_Q}{8})$. Hence,
it is sufficient to consider a constant $\lambda$, and (24) becomes

$$\begin{aligned}
\min \quad & \max_{x,y} q(x)\,q(y) \\
\text{s.t.} \quad & q_X(1)\,q_Y(1) \le \frac{4-S_Q}{8}.
\end{aligned}$$

Similar to the reasoning of the single-run case, we can use
a deterministic strategy with a constant $\lambda$ and $a(x,\lambda) = b(y,\lambda) = 0$, and
simplify problem (23) to

$$\begin{aligned}
\min \quad & \Big(\max_{\mathbf{x},\mathbf{y}} q(\mathbf{x},\mathbf{y})\Big)^{1/n} \\
\text{s.t.} \quad & \frac{1}{n}\sum_{\mathbf{x},\mathbf{y}\in\{0,1\}^n} q(\mathbf{x},\mathbf{y})\,\mathbf{x}^T\mathbf{y} \le \frac{4-S_Q}{8}, \\
& q(\mathbf{x},\mathbf{y}) = q(\mathbf{x})\,q(\mathbf{y}),
\end{aligned}$$

which is (2).
Appendix II

Properties of a Function


We study some properties of the function $f(t)$ defined in
(13). Recall that

$$f(t) = \max_{2t \le x \le \frac{1}{2}} \Big(h_b(x) + h_b\Big(\frac{t}{x}\Big)\Big), \qquad 0 \le t \le 0.25.$$

The next lemma implies that $f(t) = 2h_b(\sqrt{t})$ for $0.0625 \le
t \le 0.25$.

Lemma 8. For $0.0625 \le t \le 0.25$ and $2t \le x \le 0.5$, we have

$$h_b(x) + h_b\Big(\frac{t}{x}\Big) \le 2h_b(\sqrt{t}),$$

where the equality holds for $x = \sqrt{t}$. That is, $f(t) = 2h_b(\sqrt{t})$
for $t \in [0.0625, 0.25]$.

Proof: Fix $t$. Let $u(x) = h_b(x) + h_b(\frac{t}{x})$. Observe that
$u(x) = u(\frac{t}{x})$. Thus it suffices to show $u(x) \le 2h_b(\sqrt{t})$ for
$2t \le x \le \sqrt{t}$. Taking the derivative of $u$, we have

$$u'(x) = -\log x + \log(1-x) + \frac{t}{x^2}\log\Big(\frac{t}{x}\Big) - \frac{t}{x^2}\log\Big(1 - \frac{t}{x}\Big).$$

Let $v(x) = -x\log x + x\log(1-x)$; we have

$$x\,u'(x) = v(x) - v\Big(\frac{t}{x}\Big). \tag{25}$$

From $t \ge \frac{1}{16}$ we have, for $x \le \frac{1}{4}$,

$$\frac{t}{x} \ge \frac{1}{2} - x \ge \frac{1}{4}. \tag{26}$$

We may verify that $v$ is decreasing on $[0.25, 0.5]$. If $x \ge 0.25$,
then $x\,u'(x) \ge 0$ since $x \le \frac{t}{x}$. Otherwise, we may verify
$v(x) \ge v(0.5 - x)$ for $x < 0.25$. Then applying (26) to (25), we
have

$$x\,u'(x) = v(x) - v\Big(\frac{t}{x}\Big) \ge v(x) - v(0.5 - x) \ge 0. \tag{27}$$

Therefore $u$ is an increasing function on $[2t, \sqrt{t}]$, which
implies $u(x) \le 2h_b(\sqrt{t})$. ■
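
Lemma 8 can be probed numerically by maximizing $h_b(x) + h_b(t/x)$ over a fine grid of $x \in [2t, 1/2]$ and comparing with $2h_b(\sqrt{t})$. The sketch below does this; `f_grid` is a hypothetical grid approximation of $f$ (it slightly underestimates the true maximum), and the grid size is an arbitrary choice.

```python
import math

def hb(t):
    """Binary entropy in bits, with the convention hb(0) = hb(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

def f_grid(t, steps=4000):
    """Grid approximation of f(t) = max over x in [2t, 1/2] of hb(x) + hb(t/x), for t > 0."""
    lo, hi = 2 * t, 0.5
    best = 0.0
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        best = max(best, hb(x) + hb(t / x))
    return best

def closed_form(t):
    """The value 2 * hb(sqrt(t)) that Lemma 8 identifies with f(t) on [0.0625, 0.25]."""
    return 2 * hb(math.sqrt(t))

ts = [0.0625 + 0.0125 * j for j in range(16)]          # 0.0625, ..., 0.25
max_gap = max(abs(f_grid(t) - closed_form(t)) for t in ts)

# Outside the range of the lemma the closed form no longer describes f.
small_t = 0.01
f_small, closed_small = f_grid(small_t), closed_form(small_t)
```

For $t$ below 0.0625 the maximum moves to the boundary $x = 1/2$, e.g. at $t = 0.01$ the grid value exceeds $2h_b(0.1)$, which is why the lemma is stated only for $t \ge 0.0625$.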

Lemma 9. Function $f(t)$ is increasing on $[0,0.25]$.

Proof: To show that $f$ is increasing, fix any $0 \le t_1 <
t_2 \le 0.25$. We write $f(t_1) = h_b(x_1) + h_b(y_1)$ where $x_1$
maximizes $h_b(x) + h_b(\frac{t_1}{x})$ for $x \in [2t_1, 0.5]$ and $x_1y_1 = t_1$.
We know that $0 \le x_1, y_1 \le 0.5$. Find $x_2$ and $y_2$ such that
$x_1 \le x_2 \le \frac{1}{2}$, $y_1 \le y_2 \le \frac{1}{2}$ and $x_2y_2 = t_2$. Therefore

$$f(t_1) = h_b(x_1) + h_b(y_1) \le h_b(x_2) + h_b(y_2) \le f(t_2).$$ ■

Lemma 10. For any $d \ge c_Q = \frac{2-\sqrt{2}}{4} \approx 0.1464$, if $k$ real
numbers $t_1, t_2, \ldots, t_k \in [0,0.25]$ are such that $\frac{1}{k}\sum_{i=1}^k t_i \le d$,
we have

$$\frac{1}{k}\sum_{i=1}^k f(t_i) \le f(d).$$
i —1 
Proof: Let $f_0(t) = 2h_b(\sqrt{t})$, $0 \le t \le 0.25$. From
Lemma 8, $f(t) = f_0(t)$ for $t \ge 0.0625$. Let $f_1$ be the tangent
line of $f_0$ at $(0.14, f_0(0.14))$. Notice that $h_b(x)$ and $\sqrt{x}$ are
both concave on their domains. We see that $f_0(t)$ is also
concave on $[0,0.25]$. Since $f_0$ is concave and increasing
on $[0,0.25]$, $f_1$ is an increasing function, and for every
$t \in [0,0.25]$, $f_0(t) \le f_1(t)$.

Let $g(t)$ be a function defined on $[0,0.25]$ such that

$$g(t) = \begin{cases} f_1(t), & 0 \le t \le 0.14; \\ f_0(t), & 0.14 < t \le 0.25. \end{cases}$$

Observe that $g$ is linear on $[0,0.14]$ and concave on
$[0.14, 0.25]$, thus $g$ is concave on $[0,0.25]$. For $0 \le t < 0.0625$,

$$\begin{aligned}
f(t) &\le f(0.0625) \\
&= f_0(0.0625) \quad (\approx 1.623) \\
&< g(0) \quad (\approx 1.630) \\
&\le g(t).
\end{aligned}$$

For $0.0625 \le t \le 0.25$,

$$f(t) = f_0(t) \le g(t).$$

Thus $g$ is never smaller than $f$. Take $t'_1, \ldots, t'_k \in [0,0.25]$
such that $t_i \le t'_i$ for all $1 \le i \le k$, while $\frac{1}{k}\sum_{i=1}^k t'_i = d$.
Applying Jensen's inequality, we have

$$\frac{1}{k}\sum_{i=1}^k f(t_i) \le \frac{1}{k}\sum_{i=1}^k f(t'_i) \le \frac{1}{k}\sum_{i=1}^k g(t'_i) \le g\Big(\frac{1}{k}\sum_{i=1}^k t'_i\Big) = g(d) = f(d),$$

where the first inequality holds since $f$ is increasing, the
second inequality holds since $g$ is never smaller than $f$, and
the last equality follows from $d \ge c_Q > 0.14$. ■
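
The two constants quoted in the proof of Lemma 10, $f_0(0.0625) \approx 1.623$ and $g(0) \approx 1.630$, can be recomputed from the definition of the tangent line. The sketch below assumes the derivative $f_0'(t) = \log_2\!\big((1-\sqrt{t})/\sqrt{t}\big)/\sqrt{t}$, obtained by the chain rule; the function names are illustrative.

```python
import math

def hb(t):
    """Binary entropy in bits, with the convention hb(0) = hb(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)

def f0(t):
    """f0(t) = 2 * hb(sqrt(t)) on [0, 0.25]."""
    return 2 * hb(math.sqrt(t))

def f0_prime(t):
    """Derivative of f0 for 0 < t < 0.25, by the chain rule."""
    s = math.sqrt(t)
    return math.log2((1 - s) / s) / s

def f1(t, t0=0.14):
    """Tangent line of f0 at t0, evaluated at t."""
    return f0(t0) + f0_prime(t0) * (t - t0)

value_at_lemma8_edge = f0(0.0625)   # f(0.0625) by Lemma 8
g_at_zero = f1(0.0)                 # g(0) = f1(0)

# The tangent of a concave function lies above it; check on a grid.
grid = [0.25 * k / 1000 for k in range(1, 1001)]
tangent_dominates = all(f1(t) >= f0(t) - 1e-12 for t in grid)
```

The margin between the two constants is small (about 0.007), which shows why the tangent point is chosen close to, but below, $c_Q \approx 0.1464$.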